RAW SEQUENCE LISTING 
ERROR REPORT 




BIOTECHNOLOGY
SYSTEMS
BRANCH




The Biotechnology Systems Branch of the Scientific and Technical Information
Center (STIC) detected errors when processing the following computer readable
form:




Application Serial Number: 
Source: 

Date Processed by STIC: 



THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS.

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER:

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE
APPLICANT WITH A NOTICE TO COMPLY or,
2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A

FOR^SlS^QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX 703-308-4210. 
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW:



Checker Version 3.0

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker, and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.

Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker



Raw Sequence Listing Error Summary

ERROR DETECTED / SUGGESTED CORRECTION / SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY

1  Wrapped Nucleics / Wrapped Aminos
   The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
   was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
   prevent "wrapping."

2  Invalid Line Length
   The rules require that a line not exceed 72 characters in length. This includes white spaces.

3  Misaligned Amino Numbering
   The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
   use space characters, instead.

4  Non-ASCII
   The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
   ensure your subsequent submission is saved in ASCII text.

5  Variable Length
   Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
   each n or Xaa can only represent a single residue. Please present the maximum number of each
   residue having variable length and indicate in the <220>-<223> section that some may be missing.

6  PatentIn 2.0 "bug"
   A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
   sequence(s). Normally, PatentIn would automatically generate this section from the
   previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
   the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
   Artificial or Unknown sequences.

7  Skipped Sequences (OLD RULES)
   Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:
   (2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   (i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   This sequence is intentionally skipped
   Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

8  Skipped Sequences (NEW RULES)
   Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:
   <210> sequence id number
   <400> sequence id number
   000

9  Use of n's or Xaa's
   Use of n's and/or Xaa's has been detected in the Sequence Listing.
   Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
   In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

10 Invalid <213> Response
   Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
   scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown
   or is Artificial Sequence.

11 Use of <220>
   Sequence(s) missing the <220> "Feature" and associated numeric identifiers and responses.
   Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
   "Unknown." Please explain source of genetic material in <220> to <223> section.
   (See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

12 PatentIn 2.0 "bug"
   Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
   resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
   listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

13 Misuse of n
   n can only be used to represent a single nucleotide in a nucleic acid sequence. N is not used to represent
   any value not specifically a nucleotide.
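
The purely mechanical checks in the summary above (line length, tab codes, non-ASCII characters) can be illustrated with a short Python sketch. This is not the Checker program and does not reproduce its behavior; the function name `check_line_format` and the problem labels are hypothetical, and the sketch assumes the line terminator is not counted toward the 72-character limit.

```python
def check_line_format(lines, max_len=72):
    """Return (line_number, problem) pairs for the given listing lines.

    Hypothetical helper, not part of Checker or PatentIn.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        # Assumption: the line terminator does not count toward the limit.
        text = line.rstrip("\r\n")
        # White space counts toward the 72-character limit.
        if len(text) > max_len:
            problems.append((number, "invalid line length"))
        # Tab codes between numbers cause misaligned numbering.
        if "\t" in text:
            problems.append((number, "tab code"))
        # The file must be plain ASCII text.
        if not text.isascii():
            problems.append((number, "non-ASCII"))
    return problems


sample = [
    "<213> ORGANISM: Oryza sativa\n",
    "g" * 73 + "\n",
    "1\t5\t10\n",
]
for number, problem in check_line_format(sample):
    print(number, problem)
```

A check like this only covers the format items; the content rules (for example, when a <220>-<223> section is mandatory) need the listing to be parsed first.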
RAW SEQUENCE LISTING

PATENT APPLICATION: US/09/981,900 



DATE: 11/08/2001 
TIME: 13:18:37 



Input Set : A:\es.txt 

Output Set: N:\CRF3\11082001\I981900.raw



Does Not Comply
Corrected Diskette Needed

<110> APPLICANT: Sticklen, Masomeh B
Maqbool, Shahina B
Dale, Bruce E
<120> TITLE OF INVENTION: TRANSGENIC PLANTS CONTAINING LIGNINASE AND CELLULASE WHICH DEGRADE LIGNIN
AND CELLULOSE TO FERMENTABLE SUGARS
<130> FILE REFERENCE: MSU 4.1-539
<140> CURRENT APPLICATION NUMBER: US/09/981,900
<141> CURRENT FILING DATE: 2001-10-18
<150> PRIOR APPLICATION NUMBER: 60/242,408
<151> PRIOR FILING DATE: 2000-10-20
<160> NUMBER OF SEQ ID NOS: 22
<170> SOFTWARE: PatentIn version 3.1
<210> SEQ ID NO: 1
<211> LENGTH: 1110
<212> TYPE: DNA
<213> ORGANISM: Oryza sativa
<400> SEQUENCE: 1

gggtcggaga tgccaccacg gccacaaccc acgagcccgg cgcgacacca ccgcgcgcgt 60
tgagccagcc acaaacgccc gcggataggc gcgccgcacg cggccaatcc taccacatcc 120
ccggcctccg cggctcgagc gccgtgccat ccgatccgct gagttttggc tatttatacg 180
taccgcggga gcctgtgtgc agagagtgca tctcaagaag tactcgagca aagaaggaga 240
gagcttggtg agctgcagag atggccccct ccgtgatggc gtcgtcggcc accaccgtcg 300
ctcccttcca gggctcaagt ccaccgccgg catgccgtcg cccgccgtcc gaactccagc 360
ttcggcaacg tcagcatggc ggcaggatca ggtgcatgca ggtaattacc tactgatcca 420
acacacattc ttcttcttct tcttcttctt aaccaacatt aaccaacaac tcaattatcg 480
tttattcatt gaggtgtggc cgattgaggg catcaagaag ttcgagaccc tctcctacct 540
gccaccgctc accgtggagg acctcctgaa gcagatcgag tacctagctc cgttccaagt 600
ggtgccctgc ctcgagttca gcaaggtcgg atttgtctac cgtgagaacc acaagtcccc 660
tggatactac gacggcaggt actggaccat gtggaagctg cccatgttcg ggtgcaccga 720
cgccacccag gtcgtcaagg agctcgagga ggccaagaag gcgtaccctg atgcattcgt 780
ccgtatcatc ggcttcgaca acgttaggca ggtgcagctc atcagcttca tcgcctacaa 840
cccgggctgc gaggagtatg gtggcaacta agccgtcatc gtcatatata gcctcgttta 900
attgttcatc tctgattcga tgatgtctcc caccttgttt cgtgtgttcc cagtttgttt 960
catcgtcttt tgattttacc ggccgtgctc tgcttttgtt ttttcttttc acctgattct 1020
ctctctgact tgatgtaaga gtggtatctg ctacgactat atgttgtttg ggtgaggcat 1080
atgtgaatga aatctatgaa agctccggct 1110
<210> SEQ ID NO: 2
<211> LENGTH: 38
<212> TYPE: PRT
<213> ORGANISM: Oryza sativa
<400> SEQUENCE: 2

Met Ala Pro Ser Val Met Ala Ser Ser Ala Thr Thr Val Ala Pro Phe
1 5 10 15

Gln Gly Ser Ser Pro Pro Pro Ala Cys Arg Arg Pro Pro Ser Glu Leu
20 25 30

Gln Leu Arg Gln Arg Gln
35
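
The <211> LENGTH response of each entry can be cross-checked against the residues that are actually listed. The Python sketch below shows one way to count them. The function name `count_residues` is hypothetical, and the sketch assumes the caller passes only residue rows (not the numbering rows printed under amino acids) and that nucleotide rows may end with a running position count.

```python
def count_residues(rows, seq_type):
    """Count residues in the residue rows of one sequence entry.

    Hypothetical helper. `seq_type` is the <212> response, e.g. "DNA" or "PRT".
    """
    total = 0
    for row in rows:
        tokens = row.split()
        if seq_type == "PRT":
            # One three-letter code per amino acid.
            total += len(tokens)
        else:
            # Drop the running position count at the end of a nucleotide row.
            if tokens and tokens[-1].isdigit():
                tokens = tokens[:-1]
            total += sum(len(token) for token in tokens)
    return total


# Last row of a nucleotide entry and last row of a protein entry.
print(count_residues(["atgtgaatga aatctatgaa agctccggct 1110"], "DNA"))
print(count_residues(["Gln Leu Arg Gln Arg Gln"], "PRT"))
```

Comparing the total over all rows of an entry with its <211> response gives a quick consistency check before submission.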
83 <210> SEQ ID NO: 3
84 <211> LENGTH: 6
85 <212> TYPE: PRT
86 <213> ORGANISM:
88 <220> FEATURE:
89 <221> NAME/KEY: SIGNAL
90 <222> LOCATION: (1)..(6)
91 <223> OTHER INFORMATION: targets the peroxisomes of plants
94 <400> SEQUENCE: 3

96 Arg Ala Val Ala Arg Leu
97 1 5
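
The error summary states that a <220>-<223> section is mandatory when the <213> response is "Unknown" or "Artificial Sequence", or when n or Xaa is present. A minimal Python sketch of that rule follows. The function name `needs_feature_section` and its arguments are hypothetical, and the sketch assumes lower-case n in nucleotide rows and the three-letter code Xaa in protein rows.

```python
def needs_feature_section(organism, seq_type, residues):
    """Return True when a <220>-<223> section is mandatory.

    Hypothetical helper. `organism` is the <213> response, `seq_type`
    the <212> response, `residues` the residue text without numbering.
    """
    if organism.strip().lower() in ("unknown", "artificial sequence"):
        return True
    if seq_type == "PRT":
        # Compare whole tokens so that "Asn" is not mistaken for an n.
        return "Xaa" in residues.split()
    # Nucleotide rows: look for n once the group spacing is removed.
    return "n" in residues.replace(" ", "")


print(needs_feature_section("Oryza sativa", "DNA", "gggtcggaga tgccaccacg"))
print(needs_feature_section("Artificial Sequence", "PRT", "Arg Ala Val Ala Arg Leu"))
```

The check only reports whether the section is required; whether the section explains the location and identity of each n or Xaa still has to be reviewed by hand.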

100 <210> SEQ ID NO: 4 

101 <211> LENGTH: 3004 

102 <212> TYPE: DNA 

103 <213> ORGANISM: Acidothermus cellulolyticus 

105 <220> FEATURE: 

106 <221> NAME/KEY: CDS 

107 <222> LOCATION: (824)..(2512)

108 <223> OTHER INFORMATION: E1 beta-1,4-endoglucanase precursor

111 <400> SEQUENCE: 4 

112 ggatccacgt tgtacaaggt cacctgtccg tcgttctggt agagcggcgg gatggtcacc 60 
114 cgcacgatct ctcctttgtt gatgtcgacg gtcacgtggt tacggtttgc ctcggccgcg 120 
116 attttcgcgc tcgggcttgc tccggctgtc gggttcggtt tggcgtggtg tgcggagcac 180 
118 gccgaggcga tcccaatgag ggcaagggca agagcggagc cgatggcacg tcgggtggcc 240 
120 gatggggtac gccgatgggg cgtggcgtcc ccgccgcgga cagaaccgga tgcggaatag 300 
122 gtcacggtgc gacatgttgc cgtaccgcgg acccggatga caagggtggg tgcgcgggtc 360 
124 gcctgtgagc tgccggctgg cgtctggatc atgggaacga tcccaccatt ccccgcaatc 420 
126 gacgcgatcg ggagcagggc ggcgcgagcc ggaccgtgtg gtcgagccgg acgattcgcc 480 
128 catacggtgc tgcaatgccc agcgccatgt tgtcaatccg ccaaatgcag caatgcacac 540
130 atggacaggg attgtgactc tgagtaatga ttggattgcc ttcttgccgc ctacgcgtta 600 
132 cgcagagtag gcgactgtat gcggtaggtt ggcgctccag ccgtgggctg gacatgcctg 660 
134 ctgcgaactc ttgacacgtc tggttgaacg cgcaatactc ccaacaccga tgggatcgtt 720 
136 cccataagtt tccgtctcac aacagaatcg gtgcgccctc atgatcaacg tgaaaggagt 780 

138 acgggggaga acagacgggg gagaaaccaa cgggggattg gcg gtg ccg cgc gca 835 

139 Val Pro Arg Ala 

140 1 

142 ttg cgg cga gtg cct ggc tcg cgg gtg atg ctg cgg gtc ggc gtc gtc 883
143 Leu Arg Arg Val Pro Gly Ser Arg Val Met Leu Arg Val Gly Val Val
144 5 10 15 20

146 gtc gcg gtg ctg gca ttg gtt gcc gca ctc gcc aac cta gcc gtg ccg 931
147 Val Ala Val Leu Ala Leu Val Ala Ala Leu Ala Asn Leu Ala Val Pro
148 25 30 35

150 cgg ccg gct cgc gcc gcg ggc ggc ggc tat tgg cac acg agc ggc cgg 979
151 Arg Pro Ala Arg Ala Ala Gly Gly Gly Tyr Trp His Thr Ser Gly Arg
152 40 45 50

154 gag atc ctg gac gcg aac aac gtg ccg gta cgg atc gcc ggc atc aac 1027
155 Glu Ile Leu Asp Ala Asn Asn Val Pro Val Arg Ile Ala Gly Ile Asn
156 55 60 65

158 tgg ttt ggg ttc gaa acc tgc aat tac gtc gtg cac ggt ctc tgg tca 1075
159 Trp Phe Gly Phe Glu Thr Cys Asn Tyr Val Val His Gly Leu Trp Ser 

160 70 75 80 

162 cgc gac tac cgc agc atg ctc gac cag ata aag tcg ctc ggc tac aac 1123
163 Arg Asp Tyr Arg Ser Met Leu Asp Gln Ile Lys Ser Leu Gly Tyr Asn
164 85 90 95 100

166 aca atc cgg ctg ccg tac tct gac gac att ctc aag ccg ggc acc atg 1171
167 Thr Ile Arg Leu Pro Tyr Ser Asp Asp Ile Leu Lys Pro Gly Thr Met
168 105 110 115

170 ccg aac agc atc aat ttt tac cag atg aat cag gac ctg cag ggt ctg 1219
171 Pro Asn Ser Ile Asn Phe Tyr Gln Met Asn Gln Asp Leu Gln Gly Leu
172 120 125 130

174 acg tcc ttg cag gtc atg gac aaa atc gtc gcg tac gcc ggt cag atc 1267
175 Thr Ser Leu Gln Val Met Asp Lys Ile Val Ala Tyr Ala Gly Gln Ile
176 135 140 145

178 ggc ctg cgc atc att ctt gac cgc cac cga ccg gat tgc agc ggg cag 1315
179 Gly Leu Arg Ile Ile Leu Asp Arg His Arg Pro Asp Cys Ser Gly Gln
180 150 155 160

182 tcg gcg ctg tgg tac acg agc agc gtc tcg gag gct acg tgg att tcc 1363
183 Ser Ala Leu Trp Tyr Thr Ser Ser Val Ser Glu Ala Thr Trp Ile Ser
184 165 170 175 180

186 gac ctg caa gcg ctg gcg cag cgc tac aag gga aac ccg acg gtc gtc 1411
187 Asp Leu Gln Ala Leu Ala Gln Arg Tyr Lys Gly Asn Pro Thr Val Val
188 185 190 195

190 ggc ttt gac ttg cac aac gag ccg cat gac ccg gcc tgc tgg ggc tgc 1459
191 Gly Phe Asp Leu His Asn Glu Pro His Asp Pro Ala Cys Trp Gly Cys
192 200 205 210

194 ggc gat ccg agc atc gac tgg cga ttg gcc gcc gag cgg gcc gga aac 1507
195 Gly Asp Pro Ser Ile Asp Trp Arg Leu Ala Ala Glu Arg Ala Gly Asn
196 215 220 225

198 gcc gtg ctc tcg gtg aat ccg aac ctg ctc att ttc gtc gaa ggt gtg 1555
199 Ala Val Leu Ser Val Asn Pro Asn Leu Leu Ile Phe Val Glu Gly Val
200 230 235 240

202 cag agc tac aac gga gac tcc tac tgg tgg ggc ggc aac ctg caa gga 1603
203 Gln Ser Tyr Asn Gly Asp Ser Tyr Trp Trp Gly Gly Asn Leu Gln Gly
204 245 250 255 260

206 gcc ggc cag tac ccg gtc gtg ctg aac gtg ccg aac cgc ctg gtg tac 1651
207 Ala Gly Gln Tyr Pro Val Val Leu Asn Val Pro Asn Arg Leu Val Tyr
208 265 270 275

210 tcg gcg cac gac tac gcg acg agc gtc tac ccg cag acg tgg ttc agc 1699
211 Ser Ala His Asp Tyr Ala Thr Ser Val Tyr Pro Gln Thr Trp Phe Ser
212 280 285 290

214 gat ccg acc ttc ccc aac aac atg ccc ggc atc tgg aac aag aac tgg 1747
215 Asp Pro Thr Phe Pro Asn Asn Met Pro Gly Ile Trp Asn Lys Asn Trp
216 295 300 305

218 gga tac ctc ttc aat cag aac att gca ccg gta tgg ctg ggc gaa ttc 1795
219 Gly Tyr Leu Phe Asn Gln Asn Ile Ala Pro Val Trp Leu Gly Glu Phe
220 310 315 320

222 ggt acg aca ctg caa tcc acg acc gac cag acg tgg ctg aag acg ctc 1843
223 Gly Thr Thr Leu Gln Ser Thr Thr Asp Gln Thr Trp Leu Lys Thr Leu
224 325 330 335 340

226 gtc cag tac cta cgg ccg acc gcg caa tac ggt gcg gac agc ttc cag 1891
227 Val Gln Tyr Leu Arg Pro Thr Ala Gln Tyr Gly Ala Asp Ser Phe Gln
228 345 350 355

230 tgg acc ttc tgg tcc tgg aac ccc gat tcc ggc gac aca gga gga att 1939
231 Trp Thr Phe Trp Ser Trp Asn Pro Asp Ser Gly Asp Thr Gly Gly Ile
232 360 365 370

234 ctc aag gat gac tgg cag acg gtc gac aca gta aaa gac ggc tat ctc 1987
235 Leu Lys Asp Asp Trp Gln Thr Val Asp Thr Val Lys Asp Gly Tyr Leu
236 375 380 385

238 gcg ccg atc aag tcg tcg att ttc gat cct gtc ggc gcg tct gca tcg 2035
239 Ala Pro Ile Lys Ser Ser Ile Phe Asp Pro Val Gly Ala Ser Ala Ser
240 390 395 400

242 cct agc agt caa ccg tcc ccg tcg gtg tcg ccg tct ccg tcg ccg agc 2083
243 Pro Ser Ser Gln Pro Ser Pro Ser Val Ser Pro Ser Pro Ser Pro Ser
244 405 410 415 420

246 ccg tcg gcg agt cgg acg ccg acg cct act ccg acg ccg aca gcc agc 2131
247 Pro Ser Ala Ser Arg Thr Pro Thr Pro Thr Pro Thr Pro Thr Ala Ser
248 425 430 435

250 ccg acg cca acg ctg acc cct act gct acg ccc acg ccc acg gca agc 2179
251 Pro Thr Pro Thr Leu Thr Pro Thr Ala Thr Pro Thr Pro Thr Ala Ser
252 440 445 450

254 ccg acg ccg tca ccg acg gca gcc tcc gga gcc cgc tgc acc gcg agt 2227
255 Pro Thr Pro Ser Pro Thr Ala Ala Ser Gly Ala Arg Cys Thr Ala Ser
256 455 460 465

258 tac cag gtc aac agc gat tgg ggc aat ggc ttc acg gta acg gtg gcc 2275
259 Tyr Gln Val Asn Ser Asp Trp Gly Asn Gly Phe Thr Val Thr Val Ala
260 470 475 480

262 gtg aca aat tcc gga tcc gtc gcg acc aag aca tgg acg gtc agt tgg 2323
263 Val Thr Asn Ser Gly Ser Val Ala Thr Lys Thr Trp Thr Val Ser Trp
264 485 490 495 500

266 aca ttc ggc gga aat cag acg att acc aat tcg tgg aat gca gcg gtc 2371
267 Thr Phe Gly Gly Asn Gln Thr Ile Thr Asn Ser Trp Asn Ala Ala Val
268 505 510 515

270 acg cag aac ggt cag tcg gta acg gct cgg aat atg agt tat aac aac 2419
271 Thr Gln Asn Gly Gln Ser Val Thr Ala Arg Asn Met Ser Tyr Asn Asn
272 520 525 530

274 gtg att cag cct ggt cag aac acc acg ttc gga ttc cag gcg agc tat 2467
275 Val Ile Gln Pro Gly Gln Asn Thr Thr Phe Gly Phe Gln Ala Ser Tyr
276 535 540 545

278 acc gga agc aac gcg gca ccg aca gtc gcc tgc gca gca agt taa 2512
279 Thr Gly Ser Asn Ala Ala Pro Thr Val Ala Cys Ala Ala Ser
280 550 555 560

282 tacgtcgggg agccgacggg agggtccgga ccgtcggttc cccggcttcc acctatggag 2572
284 cgaacccaac aatccggacg gaactgcagg taccagagag gaacgacacg aatgcccgcc 2632
286 atctcaaaac ggctgcgagc cggcgtcctc gccggggcgg tgagcatcgc agcctccatc 2692
288 gtgccgctgg cgatgcagca tcctgccatc gccgcgacgc acgtcgacaa tccctatgcg 2752
290 ggagcgacct tcttcgtcaa cccgtactgg gcgcaagaag tacagagcga acggcgaacc 2812
292 agaccaatgc cactctcgca gcgaaaatgc gcgtcgtttc cacatattcg acggccgtct 2872
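
In a CDS feature such as the one in SEQ ID NO: 4, each codon row is printed above the amino acids it encodes, so the two rows can be checked against each other. The Python sketch below translates a codon row with the standard genetic code. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and the sketch assumes lower-case a, c, g, t codons separated by spaces and a translation that ends at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(codon_row):
    """Translate space-separated codons into three-letter amino acid codes."""
    residues = []
    for codon in codon_row.split():
        amino = CODON_TABLE[codon]
        if amino == "*":  # stop codon: nothing is printed beneath it
            break
        residues.append(THREE_LETTER[amino])
    return residues


print(" ".join(translate("gtg ccg cgc gca")))
print(" ".join(translate("gca gca agt taa")))
```

A codon row whose translation differs from the amino acid row printed beneath it points to a transcription or scanning error in one of the two rows.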
294 ggatggaccg catcgctgcg atcaacggcg tcaacggcgg acccggcttg acgacatatc 2932
296 tggacgccgc cctctcccag cagcagggaa ccacccctga agtcattgag attgtcatct 2992
298 acgatctgcc gg 3004

301 <210> SEQ ID NO: 5
302 <211> LENGTH: 562
303 <212> TYPE: PRT
304 <213> ORGANISM: Acidothermus cellulolyticus
306 <400> SEQUENCE: 5

308 Val Pro Arg Ala Leu Arg Arg Val Pro Gly Ser Arg Val Met Leu Arg
309 1 5 10 15

312 Val Gly Val Val Val Ala Val Leu Ala Leu Val Ala Ala Leu Ala Asn
313 20 25 30

316 Leu Ala Val Pro Arg Pro Ala Arg Ala Ala Gly Gly Gly Tyr Trp His
317 35 40 45

320 Thr Ser Gly Arg Glu Ile Leu Asp Ala Asn Asn Val Pro Val Arg Ile
321 50 55 60

324 Ala Gly Ile Asn Trp Phe Gly Phe Glu Thr Cys Asn Tyr Val Val His
325 65 70 75 80

328 Gly Leu Trp Ser Arg Asp Tyr Arg Ser Met Leu Asp Gln Ile Lys Ser
329 85 90 95

332 Leu Gly Tyr Asn Thr Ile Arg Leu Pro Tyr Ser Asp Asp Ile Leu Lys
333 100 105 110

336 Pro Gly Thr Met Pro Asn Ser Ile Asn Phe Tyr Gln Met Asn Gln Asp
337 115 120 125

340 Leu Gln Gly Leu Thr Ser Leu Gln Val Met Asp Lys Ile Val Ala Tyr
341 130 135 140

344 Ala Gly Gln Ile Gly Leu Arg Ile Ile Leu Asp Arg His Arg Pro Asp
345 145 150 155 160

348 Cys Ser Gly Gln Ser Ala Leu Trp Tyr Thr Ser Ser Val Ser Glu Ala
349 165 170 175

352 Thr Trp Ile Ser Asp Leu Gln Ala Leu Ala Gln Arg Tyr Lys Gly Asn
353 180 185 190

356 Pro Thr Val Val Gly Phe Asp Leu His Asn Glu Pro His Asp Pro Ala
357 195 200 205

360 Cys Trp Gly Cys Gly Asp Pro Ser Ile Asp Trp Arg Leu Ala Ala Glu
361 210 215 220

364 Arg Ala Gly Asn Ala Val Leu Ser Val Asn Pro Asn Leu Leu Ile Phe
365 225 230 235 240

368 Val Glu Gly Val Gln Ser Tyr Asn Gly Asp Ser Tyr Trp Trp Gly Gly
369 245 250 255

372 Asn Leu Gln Gly Ala Gly Gln Tyr Pro Val Val Leu Asn Val Pro Asn
373 260 265 270

376 Arg Leu Val Tyr Ser Ala His Asp Tyr Ala Thr Ser Val Tyr Pro Gln
377 275 280 285

380 Thr Trp Phe Ser Asp Pro Thr Phe Pro Asn Asn Met Pro Gly Ile Trp
381 290 295 300

384 Asn Lys Asn Trp Gly Tyr Leu Phe Asn Gln Asn Ile Ala Pro Val Trp
385 305 310 315 320

388 Leu Gly Glu Phe Gly Thr Thr Leu Gln Ser Thr Thr Asp Gln Thr Trp




Use of n and/or Xaa has been detected in the
Sequence Listing. Review the Sequence Listing